Recently, a phase transition has been discovered in the network community detection problem
below which no algorithm can tell which nodes belong to which communities with success any better 
than a random guess. This result has, however, so far been limited to the case where the communities 
have the same size or the same average degree. Here we consider the case where the sizes or average 
degrees are different. This asymmetry allows us to assign nodes to communities with better-than-random
success by examining their local neighborhoods. Using the cavity method, we show that
this removes the detectability transition completely for networks with four groups or fewer, while for 
more than four groups the transition persists up to a critical amount of asymmetry but not beyond. 
The critical point in the latter case coincides with the point at which local information percolates, 
causing a global transition from a less-accurate solution to a more-accurate one. 


I. INTRODUCTION 

Community detection, the division of a network into 
well-connected groups of nodes with only sparser connections
between groups, has been the subject of vigorous
research in a number of fields including physics, statistics,
and computer science [1]. A string of recent discoveries,
however, has revealed that there are fundamental limits
to our ability to detect community structure.
Using techniques from statistical physics and probability
theory, it has been shown that there can exist networks
that possess underlying community structure and
yet that structure is undetectable. In particular, for certain
classes of model networks it has been shown that
there exists a sharp detectability threshold above which 
efficient algorithms for community detection exist, but 
below which no algorithm of any kind can classify nodes 
into their correct communities with success any better 
than a random guess—or even detect the existence of 
communities in the network—if given only the network 
topology as input. 

The simplest demonstration of this effect makes use of
the stochastic block model, a probabilistic generative network
model that allows one to create artificial networks
with any number of communities of any size [3]. For networks
generated using this model the existence and location
of the detectability transition have been rigorously
proven for the case of two communities of equal size [6].
The transition is a continuous one, with the fraction of
correctly classified nodes playing the role of order parameter.
When the number of groups is increased, the phase
transition becomes more complicated, analogous to that
of random constraint satisfaction problems. For five
or more groups (or four or more in the disassortative or
antiferromagnetic case) there is a “hard/easy” threshold
where the accuracy achievable by an efficient algorithm
undergoes a first-order transition and jumps discontinuously.
Immediately below this point there is a regime
where community detection is possible in principle, but
is believed to require exponential time.


These results are for the symmetric case where the 
groups have equal size or, more generally, equal average 
degree. In this case, every node has the same probability 
distribution of local neighborhoods, so that the local environment
of a node gives us no information about what
community it belongs to. In this paper we investigate 
the less well-studied case where the groups have different 
sizes or average degrees, which is of obvious relevance to 
real networks. This case is harder to analyze than the 
case of equal groups. We tackle it using two approaches, 
both based on the cavity method of Ref. [3]. In the first, we
perform a perturbative expansion of the cavity method 
equations; in the second we consider the behavior of the 
equations under finite iteration. 

It is straightforward to see that having unequal groups 
makes community detection easier. When different 
groups have different average degrees we can use the node 
degree as a simple proxy for group membership. And 
making the group sizes unequal in general makes the average
degrees unequal too (as we will show), so again we
can use degree as a proxy. Furthermore, by propagating
degree-based estimates of group membership through the
network using a message-passing (belief propagation) algorithm,
we can improve on the accuracy of this initial
classification, labeling nodes based not only on their own
degrees but also on the degrees of their neighbors, their
neighbors’ neighbors, and so on. Iterating the message-passing
calculation repeatedly corresponds to increasing
the radius of the network neighborhood from which we 
draw information, until the classification reaches a fixed 
point when all information has been taken into account. 

It is known that the classification provided by this fixed 
point (or if there are multiple fixed points, the one with 
the highest likelihood or lowest Bethe free energy—see 
below) is optimal, in the sense that no other algorithm 
for community detection can do a better job. In particular,
if the fixed point does a poor job of assigning
nodes to groups (or if it fails completely) then no other
method will return better performance, and it is this observation
that allows us to say when the structure in the
network becomes undetectable. 

Using these methods, we show in this paper that for
four or fewer groups the second-order detectability transition
of the equal-groups case disappears, but that for five
or more groups the first-order transition, and the coexistence
regime where several competing fixed points exist,
persist up to a critical level of asymmetry. In all cases
we can classify the nodes better than chance, no matter
what the parameter values are, but while in some cases
our final accuracy is a smooth function of the parameters,
in others there is a sudden jump from low accuracy
based on purely local information to high accuracy based 
on propagating information globally across the network. 

We note that this phenomenology is qualitatively similar
to the case of “semisupervised” community detection,
where we are given the true labels of a small fraction of
nodes, and also to the Franz-Parisi spin-glass
model [13], where each node has an external field pointing
it to a reference state. In these models, the known
labels or external fields break the symmetry and provide
local information which propagates under belief propagation,
causing the coexistence region to shrink and finally
disappear at a critical point. However, the scenario we
study here is different in that our local information comes
directly from the topology of the network itself, without
the need for any “metadata” or external field.

In Sections II and III we define the stochastic block
model and describe in detail previous results on detectability
and how they were reached. Then in Section IV
we develop the theory for networks with groups
of unequal size and degree, including series expansions
around the limit of weak structure and optimal local
classifiers based on neighborhoods of a given radius. In
Section V we present extensive numerical tests on the
stochastic block model that confirm the picture painted
by our theoretical results. In Section VI we give our conclusions.


II. THE STOCHASTIC BLOCK MODEL 

The stochastic block model is a model for networks 
containing community structure. It can be used both in 
a forward direction for generating artificial networks with
tunable structure and in reverse for detecting the presence
of communities in network data by fitting the model
to the data. In this paper we do both: we use the model
to generate test networks with known community structure,
and then attempt to detect that structure by fitting
that same model to the network. This dual approach is 
central to understanding when community structure is or 
is not detectable, since there is no better way to detect 
the structure in a network (or any other data set) than to 
fit it to the very model used to generate that structure in 
the first place. As pointed out by Decelle et al. [3], this 
means that if we fail to detect the community structure 
in our networks by this method, then all other methods 
must also fail on the same networks. The structure in 
such networks can thus fairly be said to be undetectable. 


The definition of the stochastic block model is as follows.
Each of $n$ nodes is assigned to one of $q$ groups, with
probabilities $\gamma_1, \ldots, \gamma_q$ of assignment to groups 1 to $q$ respectively.
Thus $\gamma_a$ is the expected size of group $a$ as a
fraction of $n$. Once the group assignments are chosen,
edges are placed between node pairs independently at
random with probabilities $p_{ab}$ that depend only on the
groups $a, b$ that a pair belongs to. If the diagonal elements
$p_{aa}$ of the matrix of probabilities are larger than
the off-diagonal ones, the resulting network will have
traditional “assortative” community structure in which 
edges are more probable within groups than between 
them. However, other types of structure are possible and 
are observed in certain real-world networks, including 
“disassortative” structure where edges are more common 
between groups than within them, or mixed structures 
in which different groups may be variously assortative or 
disassortative with respect to one another. 

In this paper we focus on the case of a sparse network 
with constant average node degree in the limit of large 
network size, meaning that the edge probabilities $p_{ab}$
scale as $1/n$. Specifically we set $p_{ab} = c_{ab}/n$ where the
$c_{ab}$ are constants. Then the expected degree $c_a$ of a node
in group $a$ is the sum of the probabilities of connection
between it and all other nodes, averaged over all possible
assignments of nodes to communities. Letting $s_i$ denote
the group to which node $i$ belongs, we have

$$c_a = \sum_{\{s_i\}} \prod_i \gamma_{s_i} \sum_i p_{a s_i} = n \sum_b p_{ab} \gamma_b = \sum_b c_{ab} \gamma_b. \tag{1}$$

The sparse case appears to be representative of most real-world
networks and also displays a richer phase transition
structure in the community detection problem. 
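
The generative direction of the model can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names `sample_sbm` and `expected_degrees` are ours, not the paper's, and the loop over all node pairs takes time quadratic in $n$, so it is suitable only for small test networks.

```python
import random


def sample_sbm(n, gamma, c, seed=0):
    """Sample a sparse stochastic block model network (illustrative sketch).

    gamma[a] is the prior probability of group a, and c[a][b] is the
    rescaled affinity, so that the edge probability is p_ab = c[a][b] / n.
    Returns (groups, edges), where edges is a list of pairs (i, j) with i < j.
    """
    rng = random.Random(seed)
    q = len(gamma)
    # Assign each node independently to a group with probabilities gamma.
    groups = rng.choices(range(q), weights=gamma, k=n)
    edges = []
    # Place each edge independently with probability c_ab / n.
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < c[groups[i]][groups[j]] / n:
                edges.append((i, j))
    return groups, edges


def expected_degrees(gamma, c):
    """Expected degree of each group, c_a = sum_b c_ab * gamma_b, as in Eq. (1)."""
    q = len(gamma)
    return [sum(c[a][b] * gamma[b] for b in range(q)) for a in range(q)]
```

For two equal groups with $c_{aa} = 8$ and $c_{ab} = 2$ for $a \neq b$, `expected_degrees` gives an expected degree of 5 for both groups, which is the symmetric situation in which degree carries no information about group membership.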

A. Fitting the stochastic block model to network data

In this paper we consider the following problem. An 
undirected network is generated by the stochastic block 
model for some choice of $\{\gamma_a\}$ and $\{c_{ab}\}$, and our goal is
to find the best fit of the same model to the network data, 
so as to recover the community assignments planted in 
the network. 

In performing the fit, we will assume that the values of
the parameters $\gamma_a$ and $c_{ab}$ used to generate the network
are known exactly. The only quantities we need to determine
by our fit are which nodes belong to which groups.
This is a somewhat unrealistic assumption. In general,
nothing is known beforehand, and one must learn the values
of the parameters as well as the group assignments.
In some cases we can do this using an expectation-maximization
algorithm. However, our goal
here is to understand the fundamental limits on our ability
to detect community structure and for this purpose
the simpler setup considered here is a useful one. If it
is impossible to detect community structure when we are
given the values of the parameters, then it will still be impossible
when we are not given them. Hence the accuracy
we can achieve given the parameter values sets an upper 
bound on what we can achieve when the parameters are 
unknown. 

Given the parameters $\{\gamma_a\}$ and $\{c_{ab}\}$, the optimal
group assignments can be calculated by maximizing the
likelihood that the observed network was generated by
the model. In the case of sparse networks it can be
misleading to focus only on the single assignment that
maximizes the likelihood, which can result in overfitting
of the data. Instead we focus on the posterior distribution
$\mu(\{s_i\})$ over group assignments, and especially the
marginal probability of group membership for each node,
i.e., the probability $\mu_a^i$ that node $i$ belongs to group $a$:

$$\mu_a^i = \sum_{\{s_i\}} \mu(\{s_i\})\, \delta_{a,s_i}, \tag{2}$$

where $\delta_{a,b}$ is the Kronecker delta. In particular, if our
goal is to maximize the fraction of nodes labeled correctly,
the optimal strategy is to label each node $i$ with
its most-likely group, given by $\operatorname{argmax}_a \mu_a^i$.

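The optimal (maximum-likelihood) value of the posterior
distribution can be shown (via a standard derivation
involving Jensen’s inequality) to be given by maximizing
the quantity

$$\mathcal{L} = \sum_i \sum_a \mu_a^i \log \gamma_a + \sum_{(ij)} \sum_{ab} \mu_{ab}^{ij} \log c_{ab} - \frac{1}{2n} \sum_{ij} \sum_{ab} \mu_{ab}^{ij} c_{ab} - \sum_{\{s_i\}} \mu(\{s_i\}) \log \mu(\{s_i\}) \tag{3}$$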

as a function of the distribution $\mu(\{s_i\})$. Here the notation
$\sum_{(ij)}$ denotes a sum over all edges $(i,j)$ in the
network, and $\mu_{ab}^{ij}$ is the two-node marginal probability
that nodes $i$ and $j$ belong to groups $a$ and $b$ respectively:

$$\mu_{ab}^{ij} = \sum_{\{s_i\}} \mu(\{s_i\})\, \delta_{a,s_i} \delta_{b,s_j}. \tag{4}$$

The quantity $\mathcal{L}$ has the character of a free energy. Its
maximization requires us to find a distribution $\mu$ whose
one- and two-node marginals give a large value for the
average log-likelihood of the observed network (the first
three terms in $\mathcal{L}$), while also giving a large value for
the entropy term $-\sum_{\{s_i\}} \mu(\{s_i\}) \log \mu(\{s_i\})$. The traditional
approach to this problem, borrowed directly from
statistical mechanics, is to treat $\mu(\{s_i\})$ as a Gibbs distribution
over “states” $\{s_i\}$ whose Hamiltonian consists
of (minus) the first three terms in $\mathcal{L}$ (the “internal energy”)
and sample from this distribution using a Monte
Carlo algorithm. However, obtaining good statistics on
the marginals requires us to take many independent samples,
which is computationally expensive.


An elegant alternative, better suited to our current 
aims, is the belief propagation method proposed recently 
by Decelle et al. [3]. Belief propagation focuses on the 
“belief” or “message” $\mu_a^{i \to j}$, which is an estimate of the
probability that node i would belong to group a if node j 
were removed from the network (or, more precisely, if we 
lacked information about whether or not i and j have an 
edge between them). The removal of a node corresponds 
to the cavity method of statistical mechanics: it allows 
us to write down a set of self-consistent equations that 
must be satisfied by the beliefs thus [3]: 

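$$\mu_a^{i \to j} = \frac{\gamma_a}{Z_{i \to j}} \exp\Bigl( -\frac{1}{n} \sum_k \sum_b c_{ab} \mu_b^k \Bigr) \prod_{k \in \partial i \setminus j} \sum_b c_{ab} \mu_b^{k \to i}. \tag{5}$$

Here $\partial i$ denotes the set of neighbors of node $i$ and $\partial i \setminus j$
denotes that set exclusive of node $j$. The quantity $Z_{i \to j}$
is a normalizing constant that ensures that $\sum_a \mu_a^{i \to j} = 1$:

$$Z_{i \to j} = \sum_a \gamma_a \exp\Bigl( -\frac{1}{n} \sum_k \sum_b c_{ab} \mu_b^k \Bigr) \prod_{k \in \partial i \setminus j} \sum_b c_{ab} \mu_b^{k \to i}. \tag{6}$$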

These equations assume that $i$'s neighbors are independent
of each other given its state $s_i$, or equivalently, that
$i$'s neighbors are correlated only through their interaction
with $i$. As a result, belief propagation is only exact on
trees; on a finite graph with loops, it is merely an approximation.
As long as correlations in the network decay
with distance, however, it becomes exact in the limit of
large size for a network that is “locally treelike,” meaning
that almost all vertices have neighborhoods which are
trees up to a radius of O(logn). Networks generated by 
the stochastic block model satisfy this condition in the 
sparse case considered here, and hence we expect belief 
propagation to give exact results in the large-n limit. 

Implementing belief propagation consists of solving
Eq. (5) by simple iteration starting from an appropriate
initial condition and iterating until the beliefs converge
to a fixed point. The one-node marginal probabilities $\mu_a^i$
can be calculated directly from the beliefs according to

$$\mu_a^i = \frac{\gamma_a}{Z_i} \exp\Bigl( -\frac{1}{n} \sum_k \sum_b c_{ab} \mu_b^k \Bigr) \prod_{k \in \partial i} \sum_b c_{ab} \mu_b^{k \to i}, \tag{7}$$

where $Z_i$ is a normalizing constant,

$$Z_i = \sum_a \gamma_a \exp\Bigl( -\frac{1}{n} \sum_k \sum_b c_{ab} \mu_b^k \Bigr) \prod_{k \in \partial i} \sum_b c_{ab} \mu_b^{k \to i}. \tag{8}$$
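
The iteration of Eq. (5) and the marginals of Eq. (7) can be sketched in plain Python as follows. This is a minimal sketch under our own assumptions rather than the implementation used in the paper: the function name `belief_propagation`, its arguments, the in-place sweep order, and the convergence tolerance are all illustrative choices, and no attempt is made to guard against numerical underflow at nodes of very high degree.

```python
import math
import random


def belief_propagation(n, edges, gamma, c, init=None, max_sweeps=200, tol=1e-8, seed=0):
    """Iterate Eq. (5) toward a fixed point and return the marginals of Eq. (7).

    init, if given, is a list of n length-q probability vectors used as the
    starting value of every message leaving node i (a hypothetical
    convention); otherwise the messages start at random.
    """
    rng = random.Random(seed)
    q = len(gamma)
    nbrs = [[] for _ in range(n)]
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)

    def normalized(v):
        z = sum(v)
        return [x / z for x in v]

    def start(i):
        if init is not None:
            return list(init[i])
        return normalized([rng.random() for _ in range(q)])

    # msg[(i, j)][a] estimates mu_a^{i->j}; marg[i][a] estimates mu_a^i.
    msg = {(i, j): start(i) for i in range(n) for j in nbrs[i]}
    marg = [start(i) for i in range(n)]

    for _ in range(max_sweeps):
        change = 0.0
        # Non-edge field: h_a = (1/n) sum_k sum_b c_ab mu_b^k.
        tot = [sum(m[b] for m in marg) for b in range(q)]
        h = [sum(c[a][b] * tot[b] for b in range(q)) / n for a in range(q)]
        prior = [gamma[a] * math.exp(-h[a]) for a in range(q)]
        for i in range(n):
            # inc[k][a] = sum_b c_ab mu_b^{k->i} for each neighbor k of i.
            inc = {k: [sum(c[a][b] * msg[(k, i)][b] for b in range(q))
                       for a in range(q)] for k in nbrs[i]}
            for j in nbrs[i]:
                new = normalized([prior[a] * math.prod(inc[k][a] for k in nbrs[i] if k != j)
                                  for a in range(q)])
                change = max(change, max(abs(x - y) for x, y in zip(new, msg[(i, j)])))
                msg[(i, j)] = new
            marg[i] = normalized([prior[a] * math.prod(inc[k][a] for k in nbrs[i])
                                  for a in range(q)])
        if change < tol:
            break
    return marg
```

Each node can then be labeled with its most likely group, for instance `max(range(q), key=marg[i].__getitem__)`. Passing `init=[gamma] * n` when all groups have the same average degree starts the iteration at the factorized point $\mu_a^{i \to j} = \gamma_a$, which the update leaves unchanged.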

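The two-node marginals of Eq. (4) can also be calculated
from the beliefs. For pairs $i,j$ connected by an edge,

$$\mu_{ab}^{ij} = \frac{1}{Z_{ij}}\, c_{ab}\, \mu_a^{i \to j} \mu_b^{j \to i}, \tag{9}$$

where $Z_{ij}$ is another normalizing constant:

$$Z_{ij} = \sum_{ab} c_{ab}\, \mu_a^{i \to j} \mu_b^{j \to i}. \tag{10}$$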

In the sparse case, we can assume that pairs $i,j$ not connected
by an edge are independent, so that

$$\mu_{ab}^{ij} = \mu_a^i \mu_b^j. \tag{11}$$


To calculate the value of $\mathcal{L}$ itself, we can substitute the
converged values of the one- and two-node marginals obtained
from the belief propagation equations, Eqs. (7)–(11),
back into the log-likelihood, Eq. (3). The final entropy
term in Eq. (3) requires an expression for the full joint posterior
distribution $\mu(\{s_i\})$, which we assume takes the
factorized form

$$\mu(\{s_i\}) = \frac{\prod_{(ij)} \mu_{s_i s_j}^{ij}}{\prod_i \bigl(\mu_{s_i}^i\bigr)^{d_i - 1}}, \tag{12}$$


where $d_i$ is the degree of node $i$. (Again, this form is
exact on trees, and asymptotically exact on locally treelike
networks in the limit of large size; on finite networks
with loops it is only approximate, and indeed does not
even sum to 1.) After some manipulation, one can then
show that the converged value of $\mathcal{L}$, which is also equal
to the log-likelihood, is


$$\mathcal{L} = \sum_i \log Z_i - \sum_{(ij)} \log Z_{ij} + \frac{1}{2n} \sum_{ij} \sum_{ab} c_{ab}\, \mu_a^i \mu_b^j, \tag{13}$$


with $Z_i$ and $Z_{ij}$ as in Eqs. (8) and (10). This quantity
(or, rather, minus this quantity) is called the Bethe free
energy, and it can be shown that fixed points of
belief propagation are stationary points of the Bethe free
energy. In particular, there is a stable fixed point that
maximizes $\mathcal{L}$ whenever $\mu$ takes the form (12). However,
belief propagation often has many fixed points in addition
to this one, so it is possible for it to converge to a
local optimum of $\mathcal{L}$ rather than the required global optimum.
To get around this problem one typically runs the
belief propagation calculation multiple times with different
initial conditions and selects, from the fixed points
found, the one with the highest log-likelihood (or the 
lowest Bethe free energy). 

In many regimes this approach works well. However, 
it can also happen that the global optimum has an exponentially
small basin of attraction; that is, the set of
initial messages that would cause belief propagation to 
converge to it has exponentially small volume. In that 
case, finding it can be computationally difficult, which 
can lead to interesting behaviors, as we will see. 


III. DETECTABILITY TRANSITIONS 

Belief propagation is a fast and practical method for
community detection in networks and has been employed
extensively to fit the stochastic block model and other related
models to network data. It is also a
powerful tool for the formal analysis of algorithm performance.
By analyzing the fixed points of the belief propagation
equations, Eq. (5), we can make statements about
whether the method is, or is not, able to find the communities
in a network. And since the maximum-likelihood
fit performed by the algorithm is optimal in the sense described
in Section II, if the belief propagation algorithm
fails, i.e., if the fixed point with the highest likelihood
does not give the correct communities, this implies (for
locally treelike networks) that all other algorithms must
also fail. Thus results for belief propagation tell us not
just about one particular algorithm, but about all possible
algorithms for community detection.

Arguments of this type allowed Decelle et al. [3] to
show that there exist regions in the parameter space of
the stochastic block model where community structure
is undetectable by any means. Specifically, they showed
that if the average degrees, Eq. (1), are the same for all
groups, there is a trivial fixed point where $\mu_a^{i \to j} = \mu_a^i = \gamma_a$.
If belief propagation settles at this fixed point, then it
returns results no better than guessing node labels based
on the prior probabilities $\gamma_a$. For example, in the special
case where the $q$ groups have equal size $\gamma_a = 1/q$, belief
propagation concludes that all nodes are equally likely
to belong to all groups, and assigns nodes to groups with
accuracy no better than flipping a $q$-sided coin.

The so-called hard/easy transition corresponds to a bifurcation
at which this trivial fixed point becomes unstable.
This transition is known in the spin glass literature
as the de Almeida-Thouless line [33], and in information
theory as the Kesten-Stigum transition or the robust
reconstruction threshold. Above this transition,
if we initialize belief propagation with random messages,
or even with just a small perturbation away from
the trivial fixed point, it quickly moves away from that
fixed point towards another, nontrivial fixed point which
is well-correlated with the true community assignment.
Thus detecting the community structure, and labeling
the nodes with accuracy better than chance, is computationally
easy in this regime: belief propagation succeeds
at the task quickly and reliably.

The position of the hard/easy transition is relatively 
easy to compute in the well-studied special case where 
the groups have equal size and the parameters $c_{ab}$ of the
stochastic block model take just two different values:

$$c_{ab} = \begin{cases} c_\mathrm{in} & \text{if } a = b, \\ c_\mathrm{out} & \text{if } a \neq b. \end{cases} \tag{14}$$


If $c_\mathrm{in}$ is significantly greater than $c_\mathrm{out}$, this choice gives
us strong assortative structure, but as $c_\mathrm{in}$ approaches $c_\mathrm{out}$
the structure gets weaker. One might imagine that the
structure would remain detectable, albeit with some statistical
error, so long as $c_\mathrm{in} > c_\mathrm{out}$; but this is not the
case. Instead the trivial fixed point becomes stable when

$$c_\mathrm{in} - c_\mathrm{out} = q \sqrt{c}, \tag{15}$$

where

$$c = \frac{c_\mathrm{in} + (q - 1)\, c_\mathrm{out}}{q} \tag{16}$$

is the average degree of the network as a whole.
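
As a small numerical illustration, the condition of Eqs. (15) and (16) can be checked directly. The sketch below treats only the assortative case $c_\mathrm{in} > c_\mathrm{out}$ discussed in the text, and the function names are our own.

```python
import math


def average_degree(c_in, c_out, q):
    """Average degree of the whole network, c = [c_in + (q - 1) c_out] / q, Eq. (16)."""
    return (c_in + (q - 1) * c_out) / q


def above_hard_easy_transition(c_in, c_out, q):
    """True when c_in - c_out exceeds q * sqrt(c), i.e. when the trivial
    fixed point is unstable (assortative case only)."""
    return c_in - c_out > q * math.sqrt(average_degree(c_in, c_out, q))
```

For example, with $q = 2$, $c_\mathrm{in} = 8$, and $c_\mathrm{out} = 2$ the average degree is 5 and $c_\mathrm{in} - c_\mathrm{out} = 6$ exceeds $2\sqrt{5} \approx 4.47$, whereas with $c_\mathrm{in} = 6$ and $c_\mathrm{out} = 4$ the average degree is again 5 but the difference of 2 falls below the threshold.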

When the trivial fixed point is stable, belief propagation
can show different behaviors depending on whether
the stability is local or global, which in turn depends on
the number $q$ of groups. For $q \le 4$ it is globally stable
below the hard/easy transition, so that the community
structure is completely undetectable (this is known rigorously
for $q = 2$). Belief propagation will always
converge to the trivial point and return no information
about the community structure. In this case the transition
is a pitchfork bifurcation where the trivial fixed
point emerges continuously from the nontrivial one. If
we define an order parameter $\frac{1}{n} \sum_i \mu_{s_i}^i - 1/q$ equal to the
average probability given to the correct label minus the
fraction $1/q$ we would get right by chance, then this order
parameter undergoes a classic second-order phase transition
from a nonzero value above the critical point to zero
below it.
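
The order parameter described here is simple to evaluate once the marginals are known. The following sketch assumes the marginals are stored as a list of per-node probability vectors and the true labels as a list of integers; both conventions, and the function name, are ours.

```python
def order_parameter(marg, truth, q):
    """Average probability assigned to the correct label, minus the
    chance level 1/q for q equally sized groups.

    marg[i][a] is the marginal probability that node i is in group a,
    and truth[i] is the true group of node i.
    """
    n = len(truth)
    return sum(m[s] for m, s in zip(marg, truth)) / n - 1.0 / q
```

At the trivial fixed point with two equal groups every marginal is $(1/2, 1/2)$ and the order parameter vanishes, while perfect recovery gives $1 - 1/q$.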

By contrast, for $q > 4$ (or $q \ge 4$ in the disassortative
case) there is a region immediately below the easy/hard
transition where the trivial fixed point is locally stable,
but not globally stable. In this regime there is at least one
other nontrivial fixed point that is also locally stable and
corresponds to accurate classification of the nodes into
their groups. In this “coexistence region,” belief propagation
can converge to either fixed point (and hence it
may fail or succeed) depending on the initial conditions
for the iteration. Unfortunately, it appears that the basin
of attraction of the accurate fixed point is exponentially
small, so that we will almost always converge to the trivial
fixed point if we start with random messages. But
if we have the luxury of exploring the entire space of
messages, or performing an exponential number of independent
runs of belief propagation, we can still find the
accurate fixed point. And if the likelihood is higher at
this point than at the trivial fixed point, then the algorithm
that picks the solution with higher likelihood (as
described above) would choose the accurate fixed point
over the trivial one and label the nodes with good accuracy.
We would, however, need to perform exponentially
many runs of belief propagation to achieve this result.
Decelle et al. [3] have conjectured, though it has not
been proved, that in fact there exists no algorithm of any
kind that will find the accurate fixed point quickly under
these circumstances, specifically none that will find
it in polynomial time. If this conjecture is correct then it
implies the existence of a “hard but detectable” regime
where community detection is possible in principle but 
computationally hard. (It is this regime that gives the 
easy/hard transition its name.) 

If one continues to decrease $c_\mathrm{in} - c_\mathrm{out}$, there comes a
point at which the likelihoods for the two fixed points
cross over and the trivial fixed point becomes favored
even with repeated restarts. At this “condensation
threshold” the system undergoes a first-order phase transition,
where the fixed point that dominates the Gibbs
distribution changes from the accurate one to the trivial
one. Below this point there is a “clustered” regime where
many locally stable fixed points still exist, including the
accurate one, but the algorithm that selects the solution
with the highest likelihood will classify the nodes into
their groups with success no better than chance.

Thus, below the condensation point belief propagation
no longer succeeds under any circumstances and the community
structure becomes information-theoretically undetectable:
no algorithm, even one that takes exponential
time, can perform better than chance. Finally, as we
decrease $c_\mathrm{in} - c_\mathrm{out}$ even further, there is a “spinodal” or
dynamical transition where the accurate fixed point disappears
altogether, and the trivial point becomes globally
stable.

The coexistence of more than one stable fixed point
in the same parameter regime is a classic sign of a first-order
phase transition. Indeed there is a close analogy between
the behavior of the community detection problem
and a thermodynamic first-order transition. As described
above, one can regard the log-likelihood as (minus) the
free energy of a thermodynamic system, a $q$-state Potts-like
spin system in this case whose spins are the community
assignments $s_i$ of the nodes. Fixed points of the
belief propagation algorithm give us not just individual
community assignments of nodes but the entire distribution
$\mu(\{s_i\})$. Thus they correspond, in thermodynamic
terms, not to microstates but to macrostates, and competing
fixed points correspond to coexisting phases of the
system. The accurate fixed point corresponds to a ferromagnetic
phase which is correlated with the true group
assignment, and the trivial fixed point corresponds to a
paramagnetic phase. The condensation transition is the
point at which the free energy branches corresponding to
these two phases cross, making one phase thermodynamically
favored over the other at equilibrium.

The words “at equilibrium” are crucial here, implying 
that we have the luxury of sampling the entire state space 
of group assignments. Since the state space is of exponential
size as a function of $n$, this is typically not possible
for a polynomial-time algorithm. Thus even if the accurate
fixed point has a higher likelihood it may be difficult
to find it. The situation is analogous to that of a glassy
material: the lowest free energy of such a system may
be attained in the crystalline state, but if that state is
surrounded by a high free energy barrier (corresponding
dynamically to having an exponentially small basin of
attraction) then at reasonable timescales we will remain
in the trivial paramagnetic state, and fail to find the
true equilibrium. The hard/easy transition in community
detection is the point at which the free energy barrier
disappears, so that it becomes dynamically easy for the 
system to reach the ferromagnetic state and accurately 
detect the community structure. 

We can carry the physical analogy of a first-order transition
further. Suppose we start in the accurate (ferromagnetic)
phase, just above the hard/easy transition,
and then slowly decrease $c_\mathrm{in} - c_\mathrm{out}$ so that we enter
the coexistence region. We do this in “adiabatic” fashion,
making only incremental changes to the structure of
the network (adding, removing, or moving edges one by
one) and iterating the belief propagation equations to convergence
after each change, starting from the previous
fixed point. The net result will be that we stay at the
accurate fixed point even as we pass the easy/hard transition
and enter the regime where that fixed point would
be a priori hard to find. We will continue to follow the
accurate point until it disappears and we shift to the
trivial (paramagnetic) phase. At that point, we can if we
wish start increasing $c_\mathrm{in} - c_\mathrm{out}$ again, rising back through
the coexistence region but now staying at the trivial fixed
point until we once again pass the hard/easy transition,
where the trivial fixed point destabilizes and we jump
back to the accurate one, corresponding to spontaneous
magnetization. In this way we can trace out a hysteresis
loop in the behavior of the system; the transitions at
which the trivial and nontrivial fixed points become unstable
or disappear correspond to the spinodal lines
at the boundaries of the loop.

While it is true that belief propagation rarely finds 
the accurate fixed point in the coexistence region when 
the beliefs are randomly initialized, it is still possible 
to find it if we initialize the beliefs in the right way. 
In Section V we show the results of numerical calculations
where the beliefs are initialized at the known true
community assignments $s_i$ of the nodes, meaning we set
$\mu_a^{i \to j} = \delta_{a,s_i}$, where $\delta_{a,b}$ is the Kronecker delta. This
initialization places the beliefs sufficiently close to the
accurate fixed point that the process reliably converges
to it. In real-world applications one does not know the
true assignments of the nodes to communities so this calculation
is not possible (finding those assignments is the
entire point of performing belief propagation in the first
place) but we may still be able to find the accurate fixed
point if we have some side information or “metadata”
about the group assignment that allows us to guess sufficiently
good initial values of the beliefs. This is roughly
what happens in the semisupervised case mentioned in
the introduction, where we are given the correct labels of
a fraction of the nodes. This kind of information lowers
the hard/easy transition, allowing us to find the accurate
fixed point at lower values of $c_\mathrm{in} - c_\mathrm{out}$. In a
similar way, we will see that making the groups unequal
lowers the hard/easy transition and shrinks the coexistence
region, until, at a critical amount of asymmetry, it
removes the detectability transition altogether.


IV. NETWORKS WITH UNEQUAL GROUPS 

The purpose of this paper is to understand how the detectability
results reviewed in the previous section change
when the community structure is asymmetric, i.e., when
we go from equally sized groups to unequal ones. In fact,
the key question is not whether the groups have unequal
sizes, but rather whether they have unequal degrees. If
they do then the trivial fixed point $\mu_a^{i \to j} = \gamma_a$ no longer
exists, and we can no longer identify the hard/easy transition
with a simple linear stability analysis.

Here we explore two complementary approaches to this 
problem. In the first, we approximate the fixed point by 
a series expansion about the limit of weak structure. In 
the second we approximate it by performing only a finite 
number of iterations of the belief propagation equations. 


A. Series expansion 

In our first approach, we expand the equations for the 
case of unequal groups about the weak-structure limit, 
i.e., about the limit where $c_\mathrm{in} = c_\mathrm{out}$. That is, we choose
unequal sizes $\gamma_a$ for the groups then expand in powers of
the strength $c_\mathrm{in} - c_\mathrm{out}$ of the community structure. This
also results in different average degrees for the groups
(which, as we have said, is really the crucial point): from
Eq. (1), the average degree $c_a$ of a node in group $a$ is

$$c_a = \sum_b c_{ab} \gamma_b = c_\mathrm{out} + (c_\mathrm{in} - c_\mathrm{out}) \gamma_a, \tag{17}$$

so that nodes in larger groups (larger $\gamma_a$) have higher degree
on average whenever $c_\mathrm{in} > c_\mathrm{out}$. Thus we can use the
we will see, the belief propagation equations employ this 
local degree information to estimate communities with 
success better than a random guess, and moreover they 
spread that information to neighboring nodes to improve 
the results still further. The calculation is as follows. 

First, note that the average degree in the network as a whole is

$$c = \sum_a \gamma_a c_a = c_{\rm out} + (c_{\rm in} - c_{\rm out})\bar\gamma, \tag{18}$$

where

$$\bar\gamma = \sum_a \gamma_a^2 \tag{19}$$

is the expected size of the community to which a randomly chosen node belongs. Equivalently, $\bar\gamma$ is the fraction of nodes we would assign to the correct communities purely by chance if we were to place the correct number $n\gamma_a$ of nodes randomly in each group $a$.
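As a quick numerical check of Eqs. (17) to (19), the following Python sketch computes the group degrees, the network mean degree, and the expected community size of Eq. (19) for a given set of group sizes. The function name `group_degrees` and the example parameter values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def group_degrees(gamma, c_in, c_out):
    """Return (c_a, c, gamma_bar) for group sizes gamma, following Eqs. (17)-(19)."""
    gamma = np.asarray(gamma, dtype=float)
    c_a = c_out + (c_in - c_out) * gamma      # Eq. (17): mean degree of each group
    gamma_bar = float(np.sum(gamma ** 2))     # Eq. (19): expected community size
    c = float(np.dot(gamma, c_a))             # Eq. (18), evaluated as sum_a gamma_a c_a
    return c_a, c, gamma_bar

# Hypothetical example: two groups holding 70% and 30% of the nodes.
c_a, c, gamma_bar = group_degrees([0.7, 0.3], c_in=5.0, c_out=1.0)
print(c_a, c, gamma_bar)
```

The larger group receives the larger mean degree whenever `c_in > c_out`, which is the degree signal exploited in the rest of this section.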

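We now expand around the case $c_{\rm in} = c_{\rm out}$ by fixing the mean degree $c$, Eq. (18), and varying the difference

$$\epsilon = c_{\rm in} - c_{\rm out}. \tag{20}$$

This fixes the values of $c_{\rm in}$ and $c_{\rm out}$ uniquely to be $c_{\rm in} = c + (1-\bar\gamma)\epsilon$ and $c_{\rm out} = c - \bar\gamma\epsilon$, or equivalently we can write

$$c_{ab} = c + (\delta_{ab} - \bar\gamma)\epsilon. \tag{21}$$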
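The parametrization of Eq. (21) can be checked numerically. The sketch below builds the matrix $c_{ab}$ from the mean degree and the difference of Eq. (20), and confirms that the mean degree $\sum_{ab}\gamma_a c_{ab}\gamma_b$ stays fixed as the difference varies. The name `affinity_matrix` and the parameter values are assumptions made for illustration.

```python
import numpy as np

def affinity_matrix(gamma, c, eps):
    """Build c_ab = c + (delta_ab - gamma_bar) * eps, as in Eq. (21)."""
    gamma = np.asarray(gamma, dtype=float)
    gamma_bar = np.sum(gamma ** 2)            # Eq. (19)
    q = len(gamma)
    return c + (np.eye(q) - gamma_bar) * eps

# Hypothetical example with three unequal groups.
gamma = np.array([0.5, 0.3, 0.2])
c_ab = affinity_matrix(gamma, c=3.0, eps=1.5)
mean_degree = gamma @ c_ab @ gamma            # sum_a gamma_a c_a, Eq. (18)
print(c_ab)
print(mean_degree)
```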


In the limit $\epsilon \to 0$, where $c_{\rm in} = c_{\rm out}$, there is no correlation between the community structure and the topology of the network. Thus the network data tell us nothing and the probability of a node belonging to any group $a$ is simply equal to the prior probability $\gamma_a$. Indeed, it is easy to check in this case that the sole solution of the belief propagation equations is $\mu_a^{i\to j} = \gamma_a$.

We expand about this point in powers of $\epsilon$ thus:

$$\mu_a^{i\to j} = \gamma_a\bigl(1 + \alpha_a^{i\to j}\epsilon + \ldots\bigr) \tag{22}$$

for some coefficients $\alpha_a^{i\to j}$, and expand the marginals similarly:

$$\mu_a^i = \gamma_a\bigl(1 + \alpha_a^i\epsilon + \ldots\bigr). \tag{23}$$

Since $\sum_a \mu_a^{i\to j} = 1$ and $\sum_a \gamma_a = 1$, we have

$$\sum_a \gamma_a\alpha_a^{i\to j} = 0, \qquad \sum_a \gamma_a\alpha_a^i = 0. \tag{24}$$

Substituting Eq. (22) into Eq. ([^ and keeping terms to first order in $\epsilon$, we get

$$\mu_a^{i\to j} = \frac{\gamma_a}{Z_{i\to j}} \exp\Bigl[-\frac{1}{n}\sum_k\sum_b c_{ab}\gamma_b\bigl(1+\alpha_b^k\epsilon\bigr)\Bigr] \times \prod_{k\in\partial i\setminus j}\sum_b c_{ab}\gamma_b\bigl(1+\alpha_b^{k\to i}\epsilon\bigr). \tag{25}$$

The sum in the exponential is

$$\begin{aligned}
\sum_b c_{ab}\gamma_b\bigl(1+\alpha_b^k\epsilon\bigr) &= c_a + \epsilon\sum_b c_{ab}\gamma_b\alpha_b^k \\
&= c_a + \epsilon\Bigl[c_{\rm out}\sum_b\gamma_b\alpha_b^k + (c_{\rm in}-c_{\rm out})\gamma_a\alpha_a^k\Bigr] \\
&= c_a + \epsilon^2\gamma_a\alpha_a^k = c_a + O(\epsilon^2),
\end{aligned}$$

where we have used Eqs. 0, and ([^. Similarly,

$$\sum_b c_{ab}\gamma_b\bigl(1+\alpha_b^{k\to i}\epsilon\bigr) = c_a + \epsilon^2\gamma_a\alpha_a^{k\to i} = c_a + O(\epsilon^2).$$

Using these expressions in (25), along with Eq. (18) again, we get

$$\mu_a^{i\to j} = \frac{\gamma_a}{Z_{i\to j}}\, e^{-c_a} c_a^{d_i-1}, \tag{26}$$

where $Z_{i\to j}$ is the appropriate normalizing constant as usual. Notice that $\mu_a^{i\to j}$ is independent of $j$ at this order, meaning that a vertex sends the same message to each of its neighbors.

Similarly, we can calculate the one-node marginal probabilities $\mu_a^i$ from Eq. 0 and we get

$$\mu_a^i = \frac{\gamma_a}{Z_i}\, e^{-c_a} c_a^{d_i}. \tag{27}$$

This tells us that nodes with higher degree $d_i$ will have a higher probability of being placed in groups where the average degree $c_a$ is higher, while those with lower degree will have a higher probability of being placed in groups with lower average degree. In other words, the algorithm will divide the nodes according to their degrees. As a result, whenever $\epsilon > 0$ there is no regime in which we do no better than chance.

Specifically, since nodes in group $a$ have degrees which are Poisson-distributed with mean $c_a$, Eq. (27) implies that the marginals are exactly equal to the posterior probabilities of the groups given the degree, since

$$\Pr[s_i = a \mid d_i] = \frac{\gamma_a}{\Pr[d_i]}\Pr[d_i \mid s_i = a] = \frac{\gamma_a}{Z_i}\, e^{-c_a} c_a^{d_i}, \tag{28}$$

where $\gamma_a = \Pr[s_i = a]$ by definition and $Z_i = d_i!\,\Pr[d_i]$ is the required normalization constant. This is the Bayes-optimal conclusion that we can reach about $i$'s group membership, given no information except its degree, or equivalently given only its radius-1 neighborhood in the network.

That only the radius-1 neighborhood enters into this calculation is a result of the fact that, in the weak-structure limit where we treat $\epsilon$ to first order, belief propagation transmits information only one step along the edges of the network before it reaches a fixed point. If we calculate the next order in the series, treating terms up to second order in $\epsilon$, we will find ourselves taking the radius-2 neighborhood into account, classifying nodes based on their own degree and the degrees of their neighbors, and so on. This suggests an alternative approach which we describe in the following section.


B. Finite iteration of the belief propagation equations


As discussed above, a series expansion of the belief propagation equations produces a set of approximations for the fixed point that depend on information from a neighborhood of increasing radius around the node of interest. This prompts us to consider an alternative approach in which we look at the behavior of the belief propagation algorithm after a finite number of iterations of the update equations 0. Since each iteration corresponds to each node passing its current information to its neighbors, $t$ iterations mean that each node receives information from its neighbors out to distance $t$.

Suppose we start with messages derived from nothing but the prior on group assignments, i.e., $\mu_a^{i\to j} = \gamma_a$ for all $i, j, a$, and apply belief propagation for a single step. After one iteration of Eq. 0 the new values of the beliefs will be

$$\mu_a^{i\to j}(1) = \frac{\gamma_a}{Z_{i\to j}(1)}\, e^{-c_a} c_a^{d_i-1}, \tag{29}$$

where $Z_{i\to j}(1)$ is the appropriate normalizing constant as usual and we have made use of Eq. 0. These values are identical to those derived from the first-order expansion of the previous section, Eq. (26). Similarly, from Eq. 0, the one-node marginal probabilities are

$$\mu_a^i(1) = \frac{\gamma_a}{Z_i(1)}\, e^{-c_a} c_a^{d_i} = \Pr[s_i = a \mid d_i], \tag{30}$$

the same again as in the previous section, Eq. (27). And, as previously, this is the optimal Bayesian classification of the nodes based on their radius-1 neighborhoods in the network: that is, based only on how many neighbors they have, but without any further information about those neighbors (see Eq. (28)).
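The degree-only classification of Eqs. (28) and (30) is simple enough to sketch directly. The Python function below returns the posterior over groups for a node of degree `d`, working in logarithms so that large degrees do not overflow. The function name and the example values are illustrative assumptions.

```python
import numpy as np

def degree_posterior(d, gamma, c_a):
    """Posterior over groups given only the degree d.

    Proportional to gamma_a * exp(-c_a) * c_a**d, as in Eqs. (28) and (30).
    """
    gamma = np.asarray(gamma, dtype=float)
    c_a = np.asarray(c_a, dtype=float)
    log_w = np.log(gamma) - c_a + d * np.log(c_a)   # log of the unnormalized weights
    w = np.exp(log_w - log_w.max())                 # subtract the max for stability
    return w / w.sum()

# Hypothetical example: group 0 is larger and so has the higher mean degree.
gamma = np.array([0.7, 0.3])
c_a = np.array([3.8, 2.2])
for d in (0, 3, 8):
    print(d, degree_posterior(d, gamma, c_a))
```

In this example the posterior weight of the higher-degree group grows monotonically with `d`, which is the sense in which the algorithm divides nodes according to their degrees.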

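If we perform a second step of belief propagation, we get

$$\mu_a^{i\to j}(2) = \frac{\gamma_a e^{-c_a}}{Z_{i\to j}(2)} \prod_{k\in\partial i\setminus j} \frac{1}{Z_{k\to i}(1)} \sum_b \gamma_b c_{ab}\, e^{-c_b} c_b^{d_k-1} \tag{31}$$

and

$$\mu_a^i(2) = \frac{\gamma_a e^{-c_a}}{Z_i(2)} \prod_{k\in\partial i} \frac{1}{Z_{k\to i}(1)} \sum_b \gamma_b c_{ab}\, e^{-c_b} c_b^{d_k-1}. \tag{32}$$

Now the marginals depend both on $i$'s degree and the degrees of its neighbors, i.e., on $i$'s neighborhood of radius 2. And again this is the optimal Bayesian classification given this information and no other, as we can see by noting that if $k$ is a neighbor of $i$ and is of type $b$ then its so-called excess degree (that is, the number of neighbors $k$ has in addition to $i$) is Poisson-distributed with mean $c_b$. Thus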

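$$\Pr[d_k \mid k\in\partial i, s_k = b] = \frac{e^{-c_b} c_b^{d_k-1}}{(d_k-1)!}. \tag{33}$$

Furthermore, the definition of the block model gives

$$\Pr[k\in\partial i \mid s_k = b, s_i = a] = p_{ab}, \tag{34}$$

and so

$$\Pr[s_k = b \mid k\in\partial i, s_i = a] = \frac{\gamma_b p_{ab}}{\sum_{b'}\gamma_{b'}p_{ab'}} = \frac{\gamma_b c_{ab}}{c_a}. \tag{35}$$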

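Now, applying Bayes' rule and summing over all possible types of $i$'s neighbors (which are unknown to us) gives the following probability that $i$ is of type $a$, given $i$'s degree and those of its neighbors:

$$\begin{aligned}
\Pr[s_i = a \mid d_i, \{d_k\}] &\propto \gamma_a \Pr[d_i, \{d_k\} \mid s_i = a] \\
&= \gamma_a \Pr[d_i \mid s_i = a] \prod_{k\in\partial i} \Pr[d_k \mid k\in\partial i, s_i = a] \\
&= \gamma_a \frac{e^{-c_a} c_a^{d_i}}{d_i!} \prod_{k\in\partial i}\sum_b \Pr[s_k = b \mid k\in\partial i, s_i = a] \times \Pr[d_k \mid k\in\partial i, s_k = b] \\
&= \gamma_a \frac{e^{-c_a} c_a^{d_i}}{d_i!} \prod_{k\in\partial i}\sum_b \frac{\gamma_b c_{ab}}{c_a}\, \frac{e^{-c_b} c_b^{d_k-1}}{(d_k-1)!} \\
&\propto \gamma_a e^{-c_a} \prod_{k\in\partial i} \frac{1}{Z_{k\to i}(1)} \sum_b \gamma_b c_{ab}\, e^{-c_b} c_b^{d_k-1},
\end{aligned} \tag{36}$$

where the fourth line uses Eqs. (33) and (35), and in the last line the factor $c_a^{d_i}$ cancels against the $d_i$ factors of $1/c_a$ in the product, while all factors independent of $a$ are absorbed into the normalization. This expression (after normalization) matches Eq. (32).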
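The radius-2 classification in the last line of Eq. (36) can be sketched as follows. The function takes the degrees of the neighbors of node $i$ (so that $d_i$ is the length of the list) and returns the posterior over $i$'s group. The function name `radius2_posterior` and the way the inputs are passed are assumptions for illustration; the computation follows the formula above.

```python
import numpy as np

def radius2_posterior(neighbor_degrees, gamma, c_ab):
    """Posterior over the group of node i given the degrees of its neighbors.

    Implements gamma_a e^{-c_a} prod_k sum_b gamma_b c_ab e^{-c_b} c_b^{d_k - 1},
    normalized over a (last line of Eq. (36)).
    """
    gamma = np.asarray(gamma, dtype=float)
    c_ab = np.asarray(c_ab, dtype=float)
    c_a = c_ab @ gamma                              # group mean degrees, Eq. (17)
    log_w = np.log(gamma) - c_a
    for d_k in neighbor_degrees:                    # each d_k >= 1, since k neighbors i
        msg = gamma * np.exp(-c_a) * c_a ** (d_k - 1)
        msg /= msg.sum()                            # plays the role of 1 / Z_{k->i}(1)
        log_w += np.log(c_ab @ msg)                 # sum_b c_ab mu_b^{k->i}(1)
    w = np.exp(log_w - log_w.max())
    return w / w.sum()

# Hypothetical example: a node with three neighbors of degrees 1, 4 and 2.
gamma = np.array([0.7, 0.3])
c_ab = np.array([[5.0, 1.0], [1.0, 5.0]])
print(radius2_posterior([1, 4, 2], gamma, c_ab))
```

Normalizing each incoming message does not change the result, because the normalization constants do not depend on $a$.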

These results extend naturally to any number $t$ of iterations: if we start with uniform messages (all equal to the prior) and iterate belief propagation $t$ times we get the Bayes-optimal estimate of $i$'s marginals based on its network neighborhood of radius $t$. Indeed, the belief propagation equations are equivalent simply to applying Bayes' rule locally, updating $i$'s marginal based on those of its neighbors with the assumption that $i$'s neighbors are independent of each other. This holds exactly on trees and, therefore, also on locally tree-like networks such as those generated by the stochastic block model, on the radius-$t$ neighborhood of almost all vertices. Thus, iterating belief propagation $t$ times is an asymptotically optimal algorithm for labeling nodes of a stochastic block model network based on local information up to $t$ steps away in the network.
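A minimal sketch of running belief propagation for a fixed number $t$ of steps from prior-valued messages is given below. It assumes the standard sparse block-model form of the update, in which a message is proportional to $\gamma_a e^{-h_a}\prod_{k}\sum_b c_{ab}\mu_b^{k\to i}$ with field $h_a = \frac{1}{n}\sum_k\sum_b c_{ab}\mu_b^k$; the function names, the adjacency-list input, and the synchronous update schedule are choices made here for illustration and are not taken from the paper.

```python
import numpy as np

def _normalize(log_w):
    """Exponentiate log-weights and normalize them to sum to one."""
    w = np.exp(log_w - np.max(log_w))
    return w / w.sum()

def bp_marginals(adj, gamma, c_ab, t):
    """Marginals after t synchronous BP steps, starting from messages equal to gamma.

    adj   : list of neighbor lists for nodes 0..n-1 (undirected graph)
    gamma : prior group probabilities, shape (q,)
    c_ab  : q x q matrix with c_ab = n * p_ab
    """
    gamma = np.asarray(gamma, dtype=float)
    c_ab = np.asarray(c_ab, dtype=float)
    n = len(adj)
    msgs = {(i, j): gamma.copy() for i in range(n) for j in adj[i]}
    marg = np.tile(gamma, (n, 1))
    for _ in range(t):
        h = c_ab @ marg.mean(axis=0)            # field h_a from the current marginals
        # log of sum_b c_ab mu_b^{k->i} for every directed edge k -> i
        log_in = {edge: np.log(c_ab @ m) for edge, m in msgs.items()}
        new_msgs = {}
        new_marg = np.empty_like(marg)
        for i in range(n):
            total = np.log(gamma) - h
            for k in adj[i]:
                total = total + log_in[(k, i)]
            new_marg[i] = _normalize(total)
            for j in adj[i]:
                # cavity message: leave out the contribution coming from j itself
                new_msgs[(i, j)] = _normalize(total - log_in[(j, i)])
        msgs, marg = new_msgs, new_marg
    return marg

# Hypothetical example: a three-node path graph.
adj = [[1], [0, 2], [1]]
gamma = np.array([0.6, 0.4])
c_ab = np.array([[5.0, 1.0], [1.0, 5.0]])
print(bp_marginals(adj, gamma, c_ab, t=1))
```

With `t=1` the result reduces to the degree-only posterior of Eq. (30). The sketch assumes all messages stay strictly positive; exactly zero entries would need separate handling in the logarithms.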

Since we know that the local neighborhood carries information about group membership in the case of asymmetric groups, this allows us to conclude that belief propagation, starting from messages equal to the prior probabilities, will always label the nodes better than a random guess. It is by no means guaranteed, however, that a local calculation of this kind must give the best possible answer. It is possible that some nonlocal calculation could do better and indeed this is exactly what happens in the coexistence region for the case $q > 4$. In this region the local calculation does do better than a random guess, but there exists another fixed point that does better still. Finding this fixed point, however, requires us to start belief propagation very close to it, meaning we have to give the algorithm fundamentally nonlocal information, simultaneously choosing the correct values of the beliefs out to arbitrary distances.


V. NUMERICAL RESULTS 

The results of Section IV suggest that it should be possible to classify nodes into the correct groups with a success rate better than chance for all networks with $c_{\rm in} > c_{\rm out}$ when group sizes are unequal or, more generally, when average degrees are unequal. In this section, we test this prediction with numerical experiments on networks generated by the stochastic block model. As we will see, our expectations are borne out by the simulations and a number of other phenomena are revealed as well, particularly concerning the picture for networks with larger numbers of communities. As described in Section III, when $q > 4$ there are, for certain parameter regions, two stable fixed points. When the sizes or average degrees of the groups are equal, the values of the messages at one of these fixed points (the “trivial” fixed point) give no information about community memberships while those at the other give a group assignment strongly correlated with the true one, and there is a first-order phase transition between the two. When the group sizes are unequal or when the groups have different average degrees, a random guess according to the prior probabilities $\gamma_a$ achieves an accuracy of $\bar\gamma$, Eq. (19), but the calculations of the previous section suggest that even the less accurate of the two fixed points achieves an accuracy significantly better than this. Thus in this regime we expect to see a (first-order) phase transition between “good” and “better” performance, but no regime in which the algorithm fails altogether.
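For readers who wish to reproduce experiments of this kind on a small scale, a network can be sampled from the stochastic block model along the following lines. This is a simple sketch that examines every pair of nodes, so it is only practical for a few thousand nodes; the networks of $n = 100\,000$ nodes used here would call for a sparse sampling scheme. The function name `generate_sbm` and its arguments are assumptions for illustration.

```python
import numpy as np

def generate_sbm(n, gamma, c_ab, seed=0):
    """Sample (labels, adjacency lists) from a block model with p_ab = c_ab / n."""
    rng = np.random.default_rng(seed)
    gamma = np.asarray(gamma, dtype=float)
    p = np.asarray(c_ab, dtype=float) / n
    labels = rng.choice(len(gamma), size=n, p=gamma)    # planted group of each node
    i, j = np.triu_indices(n, k=1)                      # every unordered pair once
    present = rng.random(len(i)) < p[labels[i], labels[j]]
    adj = [[] for _ in range(n)]
    for u, v in zip(i[present], j[present]):
        adj[int(u)].append(int(v))
        adj[int(v)].append(int(u))
    return labels, adj

# Hypothetical example: 300 nodes in two unequal groups.
labels, adj = generate_sbm(300, [0.6, 0.4], [[5.0, 1.0], [1.0, 5.0]], seed=1)
print(sum(len(a) for a in adj) / len(adj))              # empirical mean degree
```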

In order to measure the effect of unequal group sizes and degrees, we explore a two-parameter space of block model networks. The first parameter is the difference $\epsilon = c_{\rm in} - c_{\rm out}$ between the densities of in-group and between-group connections, as defined previously in Eq. (20). The second parameter, which we denote $\delta$, measures the amount of asymmetry in the groups, i.e., how far we are from having equally sized groups. We define the group sizes $\gamma_a$ to be

$$\gamma_a = \frac{1}{q}\bigl(1 + \delta\zeta_a\bigr), \tag{37}$$

where the quantities $\zeta_a$ are of order 1 and sum to zero, $\sum_a \zeta_a = 0$. This choice satisfies the normalization constraint $\sum_a \gamma_a = 1$ and allows us to go from equal-sized groups at $\delta = 0$ to unequal ones for $\delta > 0$. For the particular simulations performed here, we consider equally spaced group sizes with

$$\zeta_a = a - \tfrac{1}{2}(q+1). \tag{38}$$
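The equally spaced group sizes of Eqs. (37) and (38) can be generated with a few lines of Python. The function name `group_sizes` and the guard against empty groups are additions made here for illustration.

```python
import numpy as np

def group_sizes(q, delta):
    """Group sizes gamma_a = (1 + delta * zeta_a) / q with zeta_a = a - (q + 1) / 2."""
    a = np.arange(1, q + 1)
    zeta = a - 0.5 * (q + 1)              # Eq. (38): equally spaced, sums to zero
    gamma = (1.0 + delta * zeta) / q      # Eq. (37)
    if np.any(gamma <= 0):
        raise ValueError("delta is too large: some group would have non-positive size")
    return gamma

print(group_sizes(3, 0.3))
```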


For $q = 3$, for example, we would have groups of size $\tfrac13$ and $(1 \pm \delta)/3$. Varying $\delta$ also varies the average group degrees. From Eq. 0 we have

$$c_a = c_{\rm out} + \frac{\epsilon}{q}\bigl(1 + \delta\zeta_a\bigr), \tag{39}$$

so the groups have different average degrees whenever $\delta > 0$.

To quantify our success (or lack of success) at identifying the planted community structure, we calculate the overlap between the planted and detected communities, equal to the fraction of nodes assigned to their correct communities by the algorithm. There is, however, some ambiguity about how the overlap is defined, given that belief propagation does not uniquely assign nodes to single communities but rather gives us the marginal probabilities $\mu_a^i$ with which the nodes belong to each community. Conventionally, one removes this ambiguity by assigning each node to the community it has the highest probability of belonging to. Then the overlap is

$$Q = \frac{1}{n}\sum_i \delta\bigl(s_i, \operatorname{argmax}_a \mu_a^i\bigr), \tag{40}$$

where $s_i$ is the planted community of node $i$ as previously, $\delta(i,j)$ is the Kronecker delta, and $\operatorname{argmax}_a f(a)$ denotes the value of $a$ that maximizes $f(a)$.

This measure has some problems, however. It throws away a lot of information contained in the marginals when a node has a significant probability of belonging to more than one group. Moreover, it can assign a node to a group even if the probability it belongs there is only a little above $1/q$, so for large $q$ the most probable assignment may be quite unlikely to be correct. An alternative measure that takes these issues into account is the marginal overlap

$$Q_\mu = \frac{1}{n}\sum_i \mu_{s_i}^i, \tag{41}$$

which is equal to the total fraction of nodes that would be assigned to the correct communities if communities were assigned randomly in proportion to their marginal probabilities.

Note that these two definitions of the overlap have different values in the weak-structure limit where the marginal probabilities are equal to the group sizes, $\mu_a^i = \gamma_a$. In the case where each node is assigned to its most likely group we end up putting all nodes in the largest group in the weak-structure limit, which means that the fraction of correctly assigned nodes is

$$Q = \max_a \gamma_a = \frac{1}{q}\bigl[1 + \tfrac{1}{2}(q-1)\delta\bigr] \tag{42}$$

for the choice of group sizes in Eq. (38). In contrast, the value of the marginal overlap in the weak-structure limit is

$$Q_\mu = \frac{1}{n}\sum_i \gamma_{s_i} = \bar\gamma = \frac{1}{q}\bigl[1 + \tfrac{1}{12}(q^2-1)\delta^2\bigr]. \tag{43}$$
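The two overlap measures of Eqs. (40) and (41) can be sketched as below, together with a check of the weak-structure values of Eqs. (42) and (43) for $q = 3$. Groups are indexed from zero in the code, and the function names and the 30-node example are assumptions made for illustration.

```python
import numpy as np

def overlap(planted, marginals):
    """Eq. (40): fraction of nodes whose most probable group is the planted one."""
    planted = np.asarray(planted)
    return float(np.mean(np.argmax(marginals, axis=1) == planted))

def marginal_overlap(planted, marginals):
    """Eq. (41): average marginal probability given to the planted group."""
    planted = np.asarray(planted)
    marginals = np.asarray(marginals)
    return float(np.mean(marginals[np.arange(len(planted)), planted]))

# Weak-structure check for q = 3, delta = 0.3: every marginal equals gamma.
q, delta = 3, 0.3
gamma = (1 + delta * (np.arange(1, q + 1) - 0.5 * (q + 1))) / q   # Eqs. (37), (38)
planted = np.repeat(np.arange(q), [7, 10, 13])    # 30 nodes in proportions gamma
marginals = np.tile(gamma, (len(planted), 1))
print(overlap(planted, marginals), marginal_overlap(planted, marginals))
```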

A. Performance of belief propagation 

Figure 1 shows the overlap (left) and marginal overlap (right) for networks with $q = 2$ groups. For these calculations we generated networks with $n = 100\,000$ nodes, average degree $c = 3$, and various values of the parameters $\delta$ and $\epsilon$, then ran the belief propagation algorithm starting from random initial messages.

For the case $\delta = 0$, where the two communities are of equal size and equal average degree, we see that there is, as in [H j H], a phase transition at $\epsilon_c = 2\sqrt{c} = 3.46\ldots$ (see Eq. ISh) from a regime where the overlap is $\tfrac12$ by either definition (no better than a random guess) to one with overlap strictly greater than $\tfrac12$. For $\delta > 0$, however, where the communities have unequal size and unequal average degree, we see that the algorithm does better than chance whenever $\epsilon > 0$; moreover, the detectability transition disappears, i.e., the overlap is a smooth function of $\epsilon$. Figure 2 provides an alternative visualization of the behavior of the system. Here we show the overlap $Q$ (left) and the convergence time (right) in the $\epsilon$-$\delta$ plane. These figures make the lack of a sharp detectability transition particularly clear: the only place where $Q$ is not a smooth function, and the only place where the convergence time diverges, is at the equal-group detectability transition, when $\delta = 0$ and $\epsilon = \epsilon_c$.

FIG. 1: The overlap $Q$, Eq. (40), and the marginal overlap $Q_\mu$, Eq. (41), for belief propagation on networks generated by the stochastic block model with $q = 2$ groups, $n = 10^5$ nodes, average degree $c = 3$, and group sizes as given in Eqs. (37) and (38), as a function of $\epsilon = c_{\rm in} - c_{\rm out}$ for various values of $\delta$. Increasing $\delta$ corresponds to greater differences between the group sizes and average degrees. The dashed lines in the left panel are the expected values in the weak-structure (i.e., $\epsilon = 0$) limit, Eq. (42). Note how the sharp detectability transition disappears for $\delta > 0$; both overlaps are smooth functions of the block model parameters.

FIG. 2: The marginal overlap (left) and log (base 10) of the convergence time (right) as a function of $\epsilon$ and $\delta$ for networks with $q = 2$ groups, size $n = 10^5$, and average degree $c = 3$. The overlap is a smooth function except at the detectability transition for equal-sized groups, which occurs when $\delta = 0$ and $\epsilon = 2\sqrt{c}$. This is also the only place where the convergence time diverges. Thus for $q = 2$ and $\delta > 0$ there is no detectability transition.


We have also performed tests on networks with three and four groups and find similar behavior. For more than four groups we expect qualitatively different behavior as described above: a first-order transition with a coexistence region below the hard/easy transition, characterized by the coexistence of two stable fixed points. Unfortunately, clear numerical confirmation of this behavior is harder to obtain. The coexistence region is difficult to see because the range of $\epsilon$ it spans is quite narrow for assortative networks. As observed in Ref. [1], however, the behavior is clearer in the disassortative case, and particularly in the fully disassortative case of a network that has connections only between different groups and none within groups. (Community detection in this case is equivalent to a “planted graph coloring problem.” In computer science a $q$-coloring of a graph is a coloring or labeling of the vertices with $q$ different labels such that no vertices with the same label have an edge between them. Our problem is equivalent to one in which we generate a random graph that we know to be colorable in this way by first assigning the labels and then adding edges only between unlike labels. Then we discard the labels and try to recover them again based only on the structure of the graph.) Since $c_{\rm in} = 0$ in a totally disassortative network, our parameter $\epsilon$ is just $-c_{\rm out}$ in this case while the average degree, Eq. (18), is $c = c_{\rm out} + (0 - c_{\rm out})\bar\gamma = c_{\rm out}(1 - \bar\gamma)$. Thus there is no need for separate parameters $c$ and $\epsilon$: fixing the average degree automatically fixes $\epsilon$.

FIG. 3: The overlap $Q$ for belief propagation on fully disassortative networks generated by the stochastic block model with $q = 5$ groups, $n = 10^5$ nodes, and various values of $\delta$ as indicated, as a function of average degree $c$. In the left panel we initialize the beliefs with uniform random values; in the right panel we initialize them with the true (planted) communities. For small values of $\delta$ there is a range of $c$ where the latter initialization gives a higher overlap, indicating a second and better fixed point with a small basin of attraction.

FIG. 4: The overlap $Q$ (left) and log (base 10) of the convergence time (right) as a function of $c$ and $\delta$ for totally disassortative networks with $q = 5$ groups and $n = 10^5$ nodes. The beliefs are initialized with random values in both panels. The first-order hard/easy transition is visible as a line in the $c$-$\delta$ plane where the overlap jumps discontinuously and the convergence time diverges. The height of the discontinuity decreases with increasing $\delta$ until we reach a critical point at which it vanishes at a second-order transition. Above this point the overlap is a smooth function of $c$ and $\delta$ and there is no detectability transition.

Recall that both of the fixed points in the coexistence region are expected to give better-than-random classification of the nodes into communities, but one is expected to perform better than the other. The two points can be considered perturbations of the “accurate” and “trivial” fixed points of the equal-groups case. Roughly speaking, the perturbed “near-trivial” fixed point corresponds to inference with local information, starting with the prior and applying belief propagation a few times, while the accurate fixed point corresponds to finding a self-consistent solution with global correlations and considerably higher accuracy. We expect both fixed points to be locally stable, but for the accurate fixed point to have an exponentially smaller basin of attraction than the near-trivial one.

FIG. 5: Phase diagram in the $c$-$\delta$ plane for the $q = 5$ fully disassortative case described in the text. The blue curve shows where the near-trivial fixed point becomes unstable (also called the Kesten-Stigum or easy/hard transition); the green curve is the point at which the accurate fixed point disappears. The gray area between the two is the coexistence region in which both fixed points are stable and belief propagation can converge to either depending on the initial conditions. The red curve is the condensation transition at which the likelihoods cross over; the black dot is the critical point above which there is no phase transition behavior at all.

To test this hypothesis we perform two separate sets of experiments. In the first we initialize belief propagation with uniformly random messages $\mu_a^{i\to j}$ (up to normalization). With this random initialization belief propagation typically converges to the near-trivial fixed point, unless we are above the hard/easy transition at which this point becomes unstable. In the second set of simulations we initialize belief propagation with messages corresponding to the true communities that we planted in the network, $\mu_a^{i\to j} = \delta_{a,s_i}$. With this planted initialization belief propagation typically converges to the accurate fixed point, unless we are below the spinodal transition at which this point disappears. Thus, above and below the coexistence region we expect these two sets of experiments to converge to the same solution, while within the coexistence region we expect them to give different solutions, with the random initialization giving a lower overlap than the planted one.
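The two initializations used in these experiments can be sketched as follows: uniformly random messages normalized to sum to one, and planted messages equal to the Kronecker delta on the true group. Messages are stored in a dictionary keyed by directed edge; this storage layout and the function names are assumptions for illustration.

```python
import numpy as np

def random_messages(adj, q, seed=0):
    """One random, normalized message per directed edge (i, j)."""
    rng = np.random.default_rng(seed)
    msgs = {}
    for i, nbrs in enumerate(adj):
        for j in nbrs:
            m = rng.random(q)
            msgs[(i, j)] = m / m.sum()      # normalize so the entries sum to one
    return msgs

def planted_messages(adj, planted, q):
    """Messages mu_a^{i->j} = delta_{a, s_i}, built from the planted labels s_i."""
    one_hot = np.eye(q)
    return {(i, j): one_hot[planted[i]].copy()
            for i, nbrs in enumerate(adj) for j in nbrs}

# Hypothetical example: a three-node path graph with planted labels 0, 1, 1.
adj = [[1], [0, 2], [1]]
rand = random_messages(adj, q=2)
plant = planted_messages(adj, [0, 1, 1], q=2)
print(rand[(0, 1)], plant[(1, 0)])
```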

Figure 3 shows the overlap $Q$ as a function of $c$ for fully disassortative networks with $q = 5$ and $n = 100\,000$, with random initial messages (left) and the planted initialization (right), run on the same set of networks in each case. As the figure shows, the results are indeed as hypothesized above. For low and high values of $c$ the two initializations give the same results, as they do also for sufficiently large values of $\delta$. For small values of $\delta$, however, there is a sizable range of values of $c$ where the overlap achieved by belief propagation with random initial messages is significantly lower than that with the planted initialization, indicating the coexistence of two competing fixed points. For comparison, the hard/easy transition for fully disassortative networks in the equal-group case [3] is at $c = (q-1)^2 = 16$.

Figure 4 again gives a different view of the results, with the left panel showing the overlap achieved by belief propagation with random initial messages in the $c$-$\delta$ plane. There is a clear curve visible in this plot where the overlap changes discontinuously as the near-trivial fixed point becomes unstable and belief propagation jumps to the accurate fixed point. Exactly on this curve, the near-trivial fixed point is marginally stable, causing the convergence time to diverge, as shown in the right panel. Thus there is a hard/easy transition in this case, even though there was none for $q = 2$, and it is a first-order transition. As the asymmetry increases with $\delta$, however, the size of the discontinuity shrinks and past a certain point (about $\delta = 0.12$) it vanishes altogether. The critical point where it vanishes is a second-order phase transition and beyond this transition the overlap is a smooth function of the block model parameters.

This behavior is reminiscent of a first-order transition
in a spin system with an external field, where the order
parameter shows a discontinuity as a function of temperature
but the size of the discontinuity decreases and then
vanishes at a critical value of the external field mm- 
In the present case the “temperature” comes from the 
average degree and/or the strength of the community 
structure, and the “external field” comes simply from 
the topology of the network. 

Figure 5 shows the behavior observed in our experiments as a single phase diagram in the $c$-$\delta$ plane. The blue curve represents the hard/easy transition at which the near-trivial fixed point becomes unstable; to the right of this curve only the accurate fixed point is stable, so all calculations converge to a high overlap, regardless of whether they are initialized randomly or with the planted communities. The green curve shows the spinodal line where the accurate fixed point disappears; to the left of this curve both initializations converge to the near-trivial fixed point, yielding a relatively low overlap. In between lies the coexistence region (gray), which extends up to the critical point at $\delta \simeq 0.12$; for $\delta$ larger than this there is no phase transition. Finally, the red curve is the condensation transition mentioned in Section III, the line at which the likelihoods (or equivalently the Bethe free energies) of the two fixed points cross. To the left of this line the algorithm that finds the fixed point with highest likelihood will choose the near-trivial fixed point over the accurate one, and hence fail to detect the communities no matter how much time is allowed.

Note that, even to the right of the hard/easy transition, there can be locally stable fixed points other than the accurate one: when $\epsilon$ is large enough or $\delta$ is small enough, there are also fixed points corresponding to various permutations of the groups. These permuted fixed points have lower likelihood than the one corresponding to the planted community structure, but for large $\epsilon$ they can have fairly large basins of attraction, causing belief propagation with random initial messages to fall into them fairly often. (This is the source of some of the fluctuations visible in Fig. [^.) Nevertheless, we can find the accurate fixed point in this case by performing a reasonable number of independent runs of belief propagation and choosing the fixed point with the highest likelihood.


B. Belief propagation with a finite number of steps 


In this section we investigate the behavior of belief propagation when run for a finite number of steps, as opposed to iterating it until it converges to a fixed point. As discussed in Section IV B, iterating belief propagation $t$ times makes optimal use of local information up to $t$ steps away, but ignores information further than that.

In Fig. 6 we show the overlap $Q$ (left) and marginal overlap $Q_\mu$ (right) for $q = 2$ groups. In each panel, the black curve shows the overlap of the fixed point to which belief propagation converges if we continue iterating it. Below that, we show two curves corresponding to iterating belief propagation for $t = 1$ and $t = 2$ steps where (as in Section IV B) the messages are initially set equal to the prior probabilities, $\mu_a^{i\to j} = \gamma_a$. As we iterate, using information about the network from larger and larger neighborhoods, the accuracy of belief propagation improves and the curves approach the overlap for the fixed point from below. We also show two further curves where the beliefs are initialized with the planted assignment $\mu_a^{i\to j} = \delta_{a,s_i}$, and these curves approach the overlap of the fixed point from above. (The fixed point is the same for either initialization, since for $q = 2$ and $\delta > 0$ there is no detectability transition.)

The curves with $t = 2$, for either initialization, already give quite a good approximation to the final overlap when $\epsilon$ is either very low or very high. Only in the intermediate region, close to the position of the detectability threshold for equal-sized groups (which in this case is at $\epsilon_c = 2\sqrt{8} \simeq 5.66$), is the approximation still poor after two iterations. This agrees with previous observations that belief propagation converges quickly everywhere except in the vicinity of the transition [3]. It also shows that, for $q = 2$ groups, local information is enough to allow belief propagation to quickly approach optimal classification as the neighborhood radius increases. (See [57] for recent rigorous results on external fields or “side information,” showing that local algorithms also succeed in that setting.)
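The two threshold values quoted in this section for equal-sized groups with $q = 2$ follow from the same expression, $\epsilon_c = 2\sqrt{c}$, evaluated at the two mean degrees used. A short check in Python, with a function name chosen here for illustration:

```python
import math

def two_group_threshold(c):
    """Equal-group detectability threshold for q = 2 as quoted in the text: 2 * sqrt(c)."""
    return 2.0 * math.sqrt(c)

for c in (3, 8):
    print(c, two_group_threshold(c))
```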

Figure 7 shows analogous results for $q = 5$ groups in the fully disassortative case, for parameter values that encompass the coexistence region where both fixed points are stable. In the coexistence region, initializing the beliefs with the prior probabilities, and thus using only local information, converges to the near-trivial fixed point, while initializing with the planted communities converges to the accurate fixed point. This behavior is visible in the figure, with two lines in each panel (in black) showing the converged overlaps. Outside the coexistence region these two lines agree but inside it they do not, showing that in this region there is a fundamental difference in the power of local vs. global information.

The remaining curves show the results for $t = 1, 2, 4, 8$ iterations. With the prior initialization these results fail to register the first-order transition, instead following an analytic continuation of (an approximation to) the near-trivial fixed point. Similarly, with the planted initialization we miss the spinodal transition and instead follow an approximate continuation of the accurate fixed point. As a result, convergence from the “wrong” initialization to the final overlap is quite slow both above and below the coexistence region.


VI. CONCLUSIONS 

We have studied the detection of community structure 
in networks generated by the stochastic block model, a 
standard model of networks with well-defined clusters of 
nodes. Previous studies have revealed the presence of 
a detectability transition in such networks, below which 
the communities are undetectable by any means. In this 
paper we study the case where the symmetry between 
the groups is broken by having groups with unequal sizes 
or unequal average degrees. 

We find that for the well-studied case of two groups, the detectability threshold disappears when the groups are unequal, making the accuracy a smooth function of the parameters of the model. On the other hand, for $q > 4$ (or $q \ge 4$ in the disassortative case), where the detectability transition is first-order in the equal-groups case, the transition persists up to a certain amount of asymmetry. Before this point is reached there is a coexistence between two competing solutions: one with low accuracy (but still better than chance) based on local information, and the other with higher accuracy based on global information. As the amount of asymmetry increases, the coexistence region shrinks and finally disappears at a critical point, beyond which there is no sharp transition. We conjecture that this local/global distinction may be a generic phenomenon in statistical inference whenever a symmetry is broken, both in networks and in other kinds of data.